# Play the chunk above and this one to get the data into your Console
View(Friendly)
?Friendly


Background

Many teachers and other educators are interested in understanding how to best deliver new content to students. In general, they have two choices of how to do this.

  1. The Meshed Approach
    • Deliver new content while simultaneously reviewing previously understood content.
  2. The Before Approach
    • Deliver new content after fully reviewing previously understood content.

A study was performed to determine whether the Meshed or Before approaches to delivering content had any positive benefits on memory recall.



Intro/Context

This analysis measures whether memory recall differs between two methods, Meshed and Before (see the Background section above for details about each method). We will conduct one parametric test and one non-parametric test, each at a significance level of α = 0.05.

Ho/Ha (parametric test)

Independent Samples T-test

\[ H_0: \mu_\text{Before} - \mu_\text{Meshed} = 0 \]

\[ H_a: \mu_\text{Before} - \mu_\text{Meshed} \neq 0 \]

Friendly1 <- filter(Friendly, condition %in% c("Before", "Meshed")) |> 
  droplevels()

Graphical summary

ggplotly(ggplot(Friendly1, aes(x=condition, y=correct)) +
         geom_boxplot(fill=c("dodgerblue", "darkorange"), color="black")+
  labs(title="Before and Meshed Method Comparison", x="Testing Methods", y="Number of Correct Words"))

This boxplot shows the word-recall results for the Before and Meshed methods. The minimum score was 24 correct words for the Before method and 30 correct words for the Meshed method. The low scores in the Before group could pull its mean down, which matters for the parametric test in this analysis because that test compares means, but we can only know for sure after we do the test calculations.

Numerical summary

favstats(correct ~ condition, data = Friendly1) |> 
pander()
condition   min   Q1      median   Q3      max   mean   sd      n    missing
Before      24    37.25   39       39.75   40    36.6   5.337   10   0
Meshed      30    36      36.5     38.75   40    36.6   3.026   10   0

This table summarizes numerically what the boxplot shows and makes the two methods easy to compare. The mean is 36.6 correct words for both methods, so the difference in sample means is zero and the Independent Samples t-test later in this analysis will have a test statistic of zero. The medians do differ, 39 correct words for Before versus 36.5 for Meshed, so the non-parametric test could in principle detect a difference. Whether that difference is statistically significant depends on the parametric and non-parametric test calculations below.

Parametric Test

Independent Samples T-test

t.test(correct ~ condition, data=Friendly1, mu=0, alternative = "two.sided") |> 
  pander(caption="Independent Samples t-test")
Independent Samples t-test

Test statistic   df      P value   Alternative hypothesis   mean in group Before   mean in group Meshed
0                14.24   1         two.sided                36.6                   36.6
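
As a check on the output above, the test statistic and degrees of freedom can be reproduced by hand from the numerical summary. This sketch assumes that `t.test` used its default Welch (unequal variances) form, which the non-integer df of 14.24 suggests. The standard deviations are the rounded values from the summary table, so the results are approximate.

\[ t = \frac{\bar{x}_\text{Before} - \bar{x}_\text{Meshed}}{\sqrt{\frac{s_\text{Before}^2}{n_\text{Before}} + \frac{s_\text{Meshed}^2}{n_\text{Meshed}}}} = \frac{36.6 - 36.6}{\sqrt{\frac{5.337^2}{10} + \frac{3.026^2}{10}}} = \frac{0}{\sqrt{2.8484 + 0.9157}} \approx \frac{0}{1.940} = 0 \]

\[ df = \frac{\left(2.8484 + 0.9157\right)^2}{\frac{2.8484^2}{9} + \frac{0.9157^2}{9}} \approx 14.24 \]

Because the two sample means are identical, the numerator is exactly zero and the p-value is 1 regardless of the standard error.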

Requirements for the parametric test (qq-plot)

qqPlot(correct ~ condition, data=Friendly1)

As we can see in the two Q-Q plots above, some points (16, 17, and 30 in the right-hand plot) indicate that the data are not normally distributed. With only 10 observations per group, we also cannot assume that the sampling distribution of the sample mean is normal. Because the normality requirement is not met, an independent samples t-test is not appropriate for deciding whether there is a significant difference between the groups. The most appropriate test for these samples is a non-parametric test.

Ho/Ha non-parametric test

\[ H_0: \text{median}_\text{Before} - \text{median}_\text{Meshed} = 0 \]

\[ H_a: \text{median}_\text{Before} - \text{median}_\text{Meshed} \neq 0 \]

Non-parametric test

wilcox.test(correct ~ condition, data=Friendly1, mu=0, alternative = "two.sided") |> 
  pander(caption="Wilcoxon Rank Sum Test")
Wilcoxon Rank Sum Test

Test statistic   P value   Alternative hypothesis
62               0.378     two.sided
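
For context on the test statistic: R's `wilcox.test` reports W as the rank sum of the first group (here Before, the first factor level) minus the smallest value that rank sum can take. The symbol \(R_\text{Before}\), the sum of the Before group's ranks in the pooled sample, is notation introduced here for illustration.

\[ W = R_\text{Before} - \frac{n_\text{Before}(n_\text{Before} + 1)}{2}, \qquad 0 \le W \le n_\text{Before} \, n_\text{Meshed} = 100, \qquad E[W \mid H_0] = \frac{n_\text{Before} \, n_\text{Meshed}}{2} = 50 \]

The observed W = 62 sits somewhat above the value of 50 expected when neither group tends to score higher, and the p-value measures how unusual a gap of that size would be under the null hypothesis.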

Conclusions

Parametric Independent Samples t-test

For this test we obtained a p-value of 1, which is higher than the significance level of α = 0.05, so we fail to reject the null hypothesis. We have insufficient evidence to conclude that the mean number of correct words differs between the Before and Meshed methods; the sample mean is 36.6 for both. Based on this t-test, there is no evidence that one method has a greater positive impact on memory recall than the other, although the Q-Q plots above suggest that the t-test is not well suited to these data.

Non-Parametric Wilcoxon Rank Sum (Mann-Whitney) Test

For this test we obtained a p-value of 0.378, which is higher than the significance level of α = 0.05, so we fail to reject the null hypothesis. We have insufficient evidence to conclude that the medians differ: the gap between the sample median of 39 correct words for the Before method and 36.5 for the Meshed method is not statistically significant.
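
The decision rule behind both conclusions can be sketched in a few lines of base R. The p-values are copied from the test output above; the names `alpha`, `p_values`, and `decisions` are made up for this illustration and are not part of the original analysis.

```r
# Significance level stated in the introduction
alpha <- 0.05

# P-values copied from the two test outputs in this analysis
p_values <- c(t_test = 1, wilcoxon = 0.378)

# Reject H0 only when the p-value falls below alpha
decisions <- ifelse(p_values < alpha, "reject H0", "fail to reject H0")
print(decisions)
```

Both p-values exceed 0.05, so both entries come out as "fail to reject H0".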

Future Studies

We could run two further tests with the data from this analysis: one comparing the Before method with Standard Free Recall (SFR), and another comparing the Meshed method with SFR. Running both tests under the same conditions would show which method has the better overall positive impact on memory recall.